ABSTRACT

This paper argues that measuring the degree of implementation is
important if causative statements about effects of an innovation are
to be made. It identifies four classes of problems which impede such
measurement: the purpose problem, the local adaptability problem, the
scalar problem, and the innovation completeness problem. An
understanding of these four classes of problems should lead to better
efforts at measuring the degree of implementation of an innovation by
making one conscious of: (1) the need to specify the purpose for
measuring and to focus work on that purpose; (2) the probability and
acceptability of local adaptations and the need to assess their
appropriateness given the local circumstances; (3) the futility of a
single measure and the advantages of a profile in measuring degree of
implementation; and (4) that responsibility for outcomes of the use of
an innovation rests both with the developer and the implementor.
(Author/DEP)

The rationale for the importance of measuring the degree of
implementation of an innovation has been made at least implicitly through
comments in this session and through writings about the research and
evaluation process. That rationale is the basis for "process evaluation"
in the CIPP evaluation model (Stufflebeam, et al. 1971) and for
"implementation evaluation" in the evaluation model described by the UCLA
Center for the Study of Evaluation (Klein, et al. 1971). Briefly stated,
that rationale says that if you don't know what happened to a group on
which you have outcome measures, you cannot explain what caused the
observed outcomes. A pretest and a posttest by themselves are
insufficient. We know full well that frequently the experiences we want
to occur between them are not carried out as planned. In fact, in some
instances I've heard of, those experiences were not even initiated.

A change in the intended experiences (read treatments) is often fatal
to a research effort, for such a change makes the independent variable
tested different from the one that the review of literature indicates
needs to be tested. It is just as fatal to an evaluation. Here a failure
of the staff to carry out the intended experiences (read sometimes as an
innovation) that goes undetected by the evaluator results in
misinformation delivered to the decision maker. For example, if a set of
classes is involved in an evaluation of the experience chart approach to
teaching reading and the teachers involved seldom elicit stories from the
children, the pre- and posttest differences in means will not help the
decision maker know how effective that innovation is in his or her
setting. Measuring the degree of implementation of an innovation is
required if we hope to say anything about its effects or its worth.

It is much easier to say we must measure the degree of implementation
than it is to do it. At least four classes of problems abound in such
efforts. Those four are the purpose problem, a multiple scale problem, a
local adaptability problem, and an innovation completeness problem. The
remainder of this presentation will attempt to describe the nature of
those problems and the complications they create.

The Purpose Problem 

What is our purpose? What accomplishment will we contribute to by
measuring the degree of implementation of an innovation? Ralph Tyler
(196^) has listed and described a half dozen general measurement purposes
and strongly argued that the measurement procedures will vary depending
on the purpose being served. The maxim, form follows function, applies
here. The same measurement form will not accomplish all functions. If we
want to diagnose implementation difficulties we would use a different
measurement procedure than if our function is to provide information
useful in an adopt/adapt/reject decision. The corollary of that maxim
also applies: function follows form. That is, if we use the same
measurement procedures for all purposes, we will not efficiently and
effectively serve all the purposes. Rather we will serve only one
purpose.

The importance of function or purpose is recognized in Nadler's work
(Nadler & Gephart, 1972) on the design or development process. Until the
purpose or function to be served is specified, it is a mistake to design
or select tools or procedures for doing the work. Nadler operates on the
maxim, function is first.

Guba (personal conversation) gives some very practical help related
to finding and stating our purpose that is applicable in measuring the
degree of implementation of an innovation. He suggests that you put
yourself into the future and pretend that the measurement has been
completed and that it did the job needed. Now, Guba suggests, answer the
question, "What has been accomplished as a result of measuring the degree
of implementation of the innovation?" The answer to that question usually
is your purpose.

Before marching off to measure and serve that purpose, Nadler (Nadler
& Gephart, 1972) implores us to check first to see that the purpose needs
to be served. He urges the development of a purpose or function
hierarchy. This is done by starting with the purpose identified through
Guba's question; pretending that function has been accomplished; asking
what higher purpose would be our concern; and recycling this questioning
until the hierarchy is extended as far as possible. Nadler says that the
appropriate function or purpose on which to focus is the one that is
least restrictive or limiting to the system.

The purpose problem is an impediment to measuring the degree of
implementation of an innovation. From the discussion above it would seem
to manifest itself in the form of: (1) failure to delineate the purpose
or function to be served by the measurement effort; and (2) failure to
use measuring procedures appropriate to that purpose.
The Local Adaptability Problem

The second problem in measuring the degree of implementation seems
to this observer to have resulted from a myth now embraced by educators
and the American public. That is the myth of the teacher-proof system.
Lots of people have contributed to this myth. Some of us have wished for
the teaching machine that would do the perfect instruction job. Others
of us have tried to develop them. Still others (notably some of our
bureaucratic leaders in Washington) have demanded that we "validate the
transportability" of educational products we create.

This myth of universal applicability (another way of saying validated
as transportable) is folly, utter folly! Nothing that has been created
in the history of American education has been shown to be universally
applicable. No text, no teaching machine, no test, no teaching procedure
has accomplished this feat. We have enough of a burden trying to create
tools and procedures that will work in the variability found in one
setting, let alone requiring that our creations be capable of meeting the
situational variables in all settings!

This objection to the myth of universal applicability is not simply
the frustrated moan of an unsuccessful developer. A specific product
or procedure is developed for a particular purpose or function. And
typically purposes or functions differ from setting to setting. Nadler
(Nadler & Gephart, 1972) reports being asked to assist a hospital staff
develop a medical records library system. They created one that greatly
satisfied the hospital staff. Three months later Nadler had an identical
request from another hospital. Most people would use the system
developed three months earlier. Nadler did not. Rather, he employed the
design process as if the first medical records library work had not
been done. In working with the second hospital, it was quickly apparent
that the function or purpose for the medical record library system was
not the same from one hospital to the other. Thus, a somewhat different
medical record library system was designed for the second hospital.

Some areas of the business and manufacturing world recognize that for
a product (or procedure) to be effective it must be locally adaptable.
For example, when you purchase an automobile you exercise a number of
options to fit the vehicle to your purposes and desires. Still other
"local adaptations" are involved as indicated in a charge you pay called
"Dealer set-up charges." This includes, for example, adjusting the
fuel-air-spark parameters so that an automobile assembled in Detroit
(500 feet above sea level) can operate efficiently in Denver (5,000 feet
above sea level). The same adaptability can be seen built into other
products: desks come with some adjustment to the length of the legs, so
that accommodation to uneven floors is possible; desk chairs come with a
height adjustment, etc.

Innovations that have any complexity are systems with numerous
component parts. The ideal system would be one which has the needed
number and type of components universally required and the needed number
and type of component parts that would permit the local adaptation
required to fit the difference in purpose found in the settings in which
it would be used. Achieving that perfectly adaptable product or procedure
is unlikely, however. First, we seldom know enough in a design effort to
create all the needed component parts. (System analysis people refer to
this as the degree of system wholeness.) As a result, we "patch-around"
unavailable components. Second, the product or procedure to be created,
if it has any complexity, has a set of required knowledge and skills for
its effective operation. Thus, the knowledge and skills possessed or
easily developed by the personnel who will use the innovation become an
upper limit for its use in a specific setting. Because of these two
factors, any given product or procedure is modified as it is used.

Given the need for and fact of local modification of an innovation, 
the measurement of degree of implementation is complicated. The problem 
becomes one of defining anticipated, actual, and appropriate adaptation. 
Preset and rigidly structured measuring techniques cannot be used
effectively in situations in which flexibility and modifiability are the
rule.

When evaluation generalizability is sought (and measuring the degree
of implementation in some instances is process evaluation), our efforts
should be patterned on the consumer products model as illustrated in the
continuing work of the Consumers' Union (Consumers Report; Gephart &
Potter, 1976). Consumers Report shows attention to two types of
decision-appropriate information. The first type consists of those
questions for which there is no correct answer (for example, do you want
to buy a car in which you'll "feel the road" or in which you will
float?), questions settled by personal values or situational conditions.
The generalizable evaluation report (and thus, the measurement of the
degree of implementation) should alert those interested in the innovation
to the set of questions on which they have options to exercise. The
second type of information consists of those items that are constants,
that are not situationally variable. Both types of information are
necessary to communicate about an innovation's worth to potential users.

The Multiple Scale Problem

As indicated earlier, any innovation that has some complexity is a
system with numerous component parts. Assessing the degree of
implementation requires observations or measurements on a number, if not
all, of these components. Some of these components will be observable in
categories yielding nominal data (for example, were all the necessary
kinds of equipment, such as desks, chairs, and books, assembled before
the innovation?). Still other components of the innovation will be of
such a nature that they may be measured in ratio scales (for example,
how much time was devoted to component X of the innovation?). Ordinal
and interval measures are also likely to be involved if the innovation
is complex.

The use of different measurement scales creates a difficulty in
summarizing data. It is conceptually impossible to combine nominal,
ordinal, interval, and ratio data without losing some information.
Thus it is impossible to get a single score that clearly describes the
degree of implementation of an innovation if different scales of
measurement are involved. It is relatively easy to get a score or scores
on the various components and thus to present a profile of the degree of
implementation. But combining those profile items into a single summary
descriptor of the degree of implementation requires value judgment about
the relative value or weighting for the different items in the profile.
And, as we all know, value judgments vary from person to person, thus
creating differences in perception of the overall quality of a given
profile.

Given the statements made earlier related to the need for adaptability
in educational products and procedures, a profile would seem to be a more
logical and beneficial way of communicating about implementation of an
innovation than a single score.
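
The contrast between a profile and a single score can be made concrete
with a small sketch in Python. Everything in it is an assumption made for
illustration and does not come from the paper or from any actual study:
the two sites, their observations, the ordinal category labels, the
planned 150 minutes, and both sets of weights are hypothetical. The
sketch keeps each observation on its own scale in a profile, then shows
what has to be decided before the profile can be collapsed into one
number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileItem:
    component: str
    scale: str     # "nominal", "ordinal", or "ratio"
    value: object  # raw observation, kept in its own units


def make_profile(equipment_ready, story_frequency, minutes_on_x):
    """Build a three-item implementation profile (hypothetical components)."""
    return [
        ProfileItem("equipment assembled beforehand", "nominal", equipment_ready),
        ProfileItem("stories elicited from children", "ordinal", story_frequency),
        ProfileItem("minutes per week on component X", "ratio", minutes_on_x),
    ]


# Assumed mappings: each one is itself a value judgment, not a fact.
ORDINAL_STEPS = {"never": 0.0, "seldom": 0.25, "often": 0.75, "always": 1.0}
PLANNED_MINUTES = 150


def rescale(item):
    """Force an observation onto 0..1, discarding its original scale."""
    if item.scale == "nominal":
        return 1.0 if item.value else 0.0
    if item.scale == "ordinal":
        return ORDINAL_STEPS[item.value]
    if item.scale == "ratio":
        return min(item.value / PLANNED_MINUTES, 1.0)
    raise ValueError(f"unknown scale: {item.scale}")


def single_score(profile, weights):
    """Weighted average of rescaled items; the weights are a judgment."""
    weighted = sum(w * rescale(item) for w, item in zip(weights, profile))
    return weighted / sum(weights)


# Two hypothetical sites with different strengths.
site_a = make_profile(True, "seldom", 150)
site_b = make_profile(False, "often", 90)

# Two hypothetical judges who weight the same three items differently.
equipment_minded = (3, 1, 1)
teaching_minded = (1, 6, 1)

for label, weights in [("equipment-minded", equipment_minded),
                       ("teaching-minded", teaching_minded)]:
    a = single_score(site_a, weights)
    b = single_score(site_b, weights)
    print(f"{label}: site A = {a:.2f}, site B = {b:.2f}")
```

Under the first set of weights site A scores higher; under the second,
site B does. The profiles themselves never changed, which is the sense in
which the single score reports the judge's values as much as the degree
of implementation.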

The Innovation Completeness Problem

Frequently a new product or procedure in education appears complete
to its developer but proves to be incomplete in another setting. This
incompleteness is the fourth category of problems in measuring the degree
of implementation of an innovation. Incompleteness of one sort has
already been alluded to in this presentation. That is incompleteness that
occurs due to our lack of knowledge or ability to create some of the
components of the ideal product or procedure. As a result, developers
"patch around" the missing components to create a feasible product or
procedure. Developers cannot be faulted for this type of incompleteness.
We cannot ask that development efforts be suspended until all the
necessary knowledge or ability to create is in. It does stand as a
developer fault, however, if potential users of the innovation are not
alerted in advance regarding the points of incompleteness.

A second form or source of incompleteness is more subtle and insidious.
That is the indispensable person problem. Developers have created products
and procedures which work when they are involved but not when they are
turned over to someone else. In such cases the crucial element in the
innovation seems to be the style of operation of the indispensable person.
For example, several years ago a group was told that an individual had
created the perfect way of teaching reading. After some discussion it was
learned that the predicted success had been demonstrated when, and only
when, the developer was the teacher. To the credit of the developer, his
role as a part of the innovation was recognized. Others are not that
observant.

One of the questions that should be asked about an innovation, then,
is, "What are its points of incompleteness?" This is perhaps best
expressed in the language of the systems analysts. Such people consider a
tool or procedure a system and the components of it as subsystems. In
general systems theory, it is readily accepted that all of the subsystems
interface (interlock or connect) in a manner which maximally serves the
overall system's function. Systems analysts speak of the concept of
wholeness in this respect.

Two techniques which help pinpoint innovation incompleteness are
flow-charting and PERT diagraming. These are aids both to the practitioner
and the person charged with measuring the degree of implementation.
Flow-charting involves the development of a chart that sequences and
interrelates actions of various sorts and decision points. A variety of
geometric figures are involved and their different shapes present relevant
meanings. (For example, a rectangle generally indicates some kind of
activity; diamonds represent decisions and specify a set of alternatives;
circles are connectors, etc. Templates for these symbols are available at
most drafting suppliers.) PERT is the acronym for Program Evaluation and
Review Technique. PERT is an analysis and review procedure created as a
management tool for the Office of Naval Research at the time the Polaris
Missile System was being created. Cook (1965) has described PERT and its
applicability to education. Central to PERT is the creation of what is
called a PERT chart in which arrows represent activities and circles
represent events. The PERT chart shows the sequence and interrelation or
interdependencies of events and subsequent activities. If either a
flowchart or a PERT chart is made for an innovation, the points of
incompleteness are more liable to be observed than if a purely verbal
description is presented.
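
A rough sketch in Python may show why a chart exposes gaps that a verbal
description hides. It is a simplified illustration under stated
assumptions, not a full PERT analysis: it carries no time estimates and
computes no critical path. The events and activities are hypothetical
ones invented for the experience chart example used earlier, and the
function name find_gaps is an assumption as well. Events play the role
of the circles and activities the role of the arrows.

```python
def find_gaps(events, activities, start, end):
    """Report events that no activity leads into or out of.

    events     -- list of event names (the circles on the chart)
    activities -- list of (name, from_event, to_event) (the arrows)
    start, end -- the first and last events, which are exempt from
                  the incoming and outgoing checks respectively
    """
    reached = {dst for _name, _src, dst in activities}
    left = {src for _name, src, _dst in activities}
    no_way_in = [e for e in events if e != start and e not in reached]
    no_way_out = [e for e in events if e != end and e not in left]
    return no_way_in, no_way_out


# Hypothetical developer's description of the experience chart approach.
events = [
    "materials ready",
    "stories elicited",
    "charts written",
    "charts read",
]
activities = [
    ("elicit stories", "materials ready", "stories elicited"),
    ("read charts aloud", "charts written", "charts read"),
]

no_way_in, no_way_out = find_gaps(
    events, activities, start="materials ready", end="charts read"
)
print("events nothing leads to:  ", no_way_in)
print("events nothing leads from:", no_way_out)
```

In this made-up description nobody is assigned the step that turns
elicited stories into written charts, so "charts written" has no arrow
arriving and "stories elicited" has none departing. A narrative account
can pass over such a step unnoticed; laid out as events and activities,
the missing arrow is a point of incompleteness that can be listed.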

Incompleteness in an innovation hampers measurement of the degree of
implementation if it is undetected. In those cases the measurer assumes
that event A will follow activity A which will be followed by activity B
and event B, etc. A point of incompleteness invalidates the assumption.
By assuming that the innovation is complete when it isn't, we shift the
responsibility for discrepancy between what the innovation is claimed to
accomplish and what it does accomplish from the developer to the
implementer, a shift that is not logically warranted.

Summary

This paper has argued that measuring the degree of implementation is
important if we want to make causative statements about effects of an
innovation. It has identified four classes of problems which impede such
measurement: the purpose problem, the local adaptability problem, the
scalar problem, and the innovation completeness problem. An understanding
of these four classes of problems should lead to better efforts at
measuring the degree of implementation of an innovation by making us
conscious of: (1) the need to specify our purpose for measuring and to
focus our work on that purpose; (2) the probability and acceptability of
local adaptations and the need to assess their appropriateness given the
local circumstances; (3) the futility of a single measure and the
advantages of a profile in measuring degree of implementation; and
(4) that responsibility for outcomes of the use of an innovation rests
both with the developer and the implementor.

of complexity, a theory should reflect this knowledge.
Research into the factorial complexity of APT forms would
also contribute to theory development.

8. There is a need for creative development of new forms
of APT that may alleviate some of the measurement
shortcomings that have been discussed. Educational measurement
specialists funded to explore such creative alternatives
would contribute new knowledge that would have immediate
use for public school testing.

Applied Performance Testing has great appeal for measuring task
performance in the public schools. There is much work to be done to refine
the concept and improve on our techniques. I believe the effort is
worthwhile and expect to see comparatively great advances in APT in the
near future.